Boosted Beta Regression
نویسندگان
چکیده
Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
منابع مشابه
Boosted Regression (Boosting): An introductory tutorial and a Stata plugin
Boosting, or boosted regression, is a recent data mining technique that has shown considerable success in predictive accuracy. This article gives an overview over boosting and introduces a new Stata command, boost, that implements the boosting algorithm described in Hastie et al. (2001, p. 322). The plugin is illustrated with a Gaussian and a logistic regression example. In the Gaussian regress...
متن کاملAdenovector Mediated Inflammation in Mice Liver Boosted by Hydrodynamic Injection
Background and Aims: Adenovector gene transfer induces inflammatory response that finally leads to vector removal from delivered site. The effect of hydrodynamic pre-injection on Adenovector mediated liver inflammation remained elusive. Materials and Methods: Different mice groups were pre-treated by hydrodynamic saline, dexamethasone or nothing prior to Adeno-Luciferase administration. Their s...
متن کاملEvaluating Hospital Case Cost Prediction Models Using Azure Machine Learning Studio
Ability for accurate hospital case cost modelling and prediction is critical for efficient health care financial management and budgetary planning. A variety of regression machine learning algorithms are known to be effective for health care cost predictions. The purpose of this experiment was to build an Azure Machine Learning Studio tool for rapid assessment of multiple types of regression mo...
متن کاملBoosted trees for ecological modeling and prediction.
Accurate prediction and explanation are fundamental objectives of statistical analysis, yet they seldom coincide. Boosted trees are a statistical learning method that attains both of these objectives for regression and classification analyses. They can deal with many types of response variables (numeric, categorical, and censored), loss functions (Gaussian, binomial, Poisson, and robust), and p...
متن کاملA Statistical Comparison of Classification Algorithms on a Single Data Set
This research uses four classification algorithms in standard and boosted forms to predict members of a class for an online community. We compare two performance measures, area under the ROC (Receiver Operating Characteristic) curve (AUC) and accuracy in the standard and boosted forms. The research compares four popular algorithms Bayes, logistic regression, J48 and Nearest Neighbor (NN). The a...
متن کامل